{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "c8df99d7",
   "metadata": {
    "slideshow": {
     "slide_type": "-"
    }
   },
   "source": [
    "# 自动微分\n",
    "\n",
    "假设我们想对函数$y=2\\mathbf{x}^{\\top}\\mathbf{x}$关于列向量$\\mathbf{x}$求导"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "id": "498c3ed5",
   "metadata": {
    "execution": {
     "iopub.execute_input": "2022-12-07T16:37:09.704166Z",
     "iopub.status.busy": "2022-12-07T16:37:09.703482Z",
     "iopub.status.idle": "2022-12-07T16:37:10.849947Z",
     "shell.execute_reply": "2022-12-07T16:37:10.848845Z"
    },
    "origin_pos": 2,
    "tab": [
     "pytorch"
    ]
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "tensor([0., 1., 2., 3.])"
      ]
     },
     "execution_count": 1,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "import torch\n",
    "\n",
    "x = torch.arange(4.0)\n",
    "x"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "45f07be6",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "在我们计算$y$关于$\\mathbf{x}$的梯度之前，需要一个地方来存储梯度"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "id": "f8f4026b",
   "metadata": {
    "execution": {
     "iopub.execute_input": "2022-12-07T16:37:10.853801Z",
     "iopub.status.busy": "2022-12-07T16:37:10.853279Z",
     "iopub.status.idle": "2022-12-07T16:37:10.858081Z",
     "shell.execute_reply": "2022-12-07T16:37:10.857050Z"
    },
    "origin_pos": 7,
    "tab": [
     "pytorch"
    ]
   },
   "outputs": [],
   "source": [
    "x.requires_grad_(True)\n",
    "x.grad"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "68b2657a",
   "metadata": {
    "slideshow": {
     "slide_type": "-"
    }
   },
   "source": [
    "现在计算$y$"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "id": "eeb79797",
   "metadata": {
    "execution": {
     "iopub.execute_input": "2022-12-07T16:37:10.861469Z",
     "iopub.status.busy": "2022-12-07T16:37:10.861031Z",
     "iopub.status.idle": "2022-12-07T16:37:10.867917Z",
     "shell.execute_reply": "2022-12-07T16:37:10.867083Z"
    },
    "origin_pos": 12,
    "tab": [
     "pytorch"
    ]
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "tensor(28., grad_fn=<MulBackward0>)"
      ]
     },
     "execution_count": 3,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "y = 2 * torch.dot(x, x)\n",
    "y"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "d978b6da",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "通过调用反向传播函数来自动计算`y`关于`x`每个分量的梯度"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "id": "b21e1420",
   "metadata": {
    "execution": {
     "iopub.execute_input": "2022-12-07T16:37:10.872410Z",
     "iopub.status.busy": "2022-12-07T16:37:10.871742Z",
     "iopub.status.idle": "2022-12-07T16:37:10.877373Z",
     "shell.execute_reply": "2022-12-07T16:37:10.876674Z"
    },
    "origin_pos": 17,
    "tab": [
     "pytorch"
    ]
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "tensor([ 0.,  4.,  8., 12.])"
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "y.backward()\n",
    "x.grad"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "id": "f72da1f0",
   "metadata": {
    "execution": {
     "iopub.execute_input": "2022-12-07T16:37:10.880808Z",
     "iopub.status.busy": "2022-12-07T16:37:10.880315Z",
     "iopub.status.idle": "2022-12-07T16:37:10.887007Z",
     "shell.execute_reply": "2022-12-07T16:37:10.886236Z"
    },
    "origin_pos": 22,
    "tab": [
     "pytorch"
    ]
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "tensor([True, True, True, True])"
      ]
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "x.grad == 4 * x"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "ecff87af",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "现在计算`x`的另一个函数"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "id": "e18e98cc",
   "metadata": {
    "execution": {
     "iopub.execute_input": "2022-12-07T16:37:10.890349Z",
     "iopub.status.busy": "2022-12-07T16:37:10.889853Z",
     "iopub.status.idle": "2022-12-07T16:37:10.895860Z",
     "shell.execute_reply": "2022-12-07T16:37:10.895107Z"
    },
    "origin_pos": 27,
    "tab": [
     "pytorch"
    ]
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "tensor([1., 1., 1., 1.])"
      ]
     },
     "execution_count": 6,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "x.grad.zero_()\n",
    "y = x.sum()\n",
    "y.backward()\n",
    "x.grad"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "c7e92e43",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "深度学习中\n",
    "，我们的目的不是计算微分矩阵，而是单独计算批量中每个样本的偏导数之和"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "id": "de02a306",
   "metadata": {
    "execution": {
     "iopub.execute_input": "2022-12-07T16:37:10.899324Z",
     "iopub.status.busy": "2022-12-07T16:37:10.898840Z",
     "iopub.status.idle": "2022-12-07T16:37:10.904849Z",
     "shell.execute_reply": "2022-12-07T16:37:10.904154Z"
    },
    "origin_pos": 32,
    "tab": [
     "pytorch"
    ]
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "tensor([0., 2., 4., 6.])"
      ]
     },
     "execution_count": 7,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "x.grad.zero_()\n",
    "y = x * x\n",
    "y.sum().backward()\n",
    "x.grad"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "c4373b64",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "将某些计算移动到记录的计算图之外"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "id": "a0d0a164",
   "metadata": {
    "execution": {
     "iopub.execute_input": "2022-12-07T16:37:10.908385Z",
     "iopub.status.busy": "2022-12-07T16:37:10.907775Z",
     "iopub.status.idle": "2022-12-07T16:37:10.915808Z",
     "shell.execute_reply": "2022-12-07T16:37:10.914379Z"
    },
    "origin_pos": 37,
    "tab": [
     "pytorch"
    ]
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "tensor([True, True, True, True])"
      ]
     },
     "execution_count": 8,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "x.grad.zero_()\n",
    "y = x * x\n",
    "u = y.detach()\n",
    "z = u * x\n",
    "\n",
    "z.sum().backward()\n",
    "x.grad == u"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "id": "16899140",
   "metadata": {
    "execution": {
     "iopub.execute_input": "2022-12-07T16:37:10.920202Z",
     "iopub.status.busy": "2022-12-07T16:37:10.919752Z",
     "iopub.status.idle": "2022-12-07T16:37:10.927012Z",
     "shell.execute_reply": "2022-12-07T16:37:10.925890Z"
    },
    "origin_pos": 42,
    "tab": [
     "pytorch"
    ]
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "tensor([True, True, True, True])"
      ]
     },
     "execution_count": 9,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "x.grad.zero_()\n",
    "y.sum().backward()\n",
    "x.grad == 2 * x"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "fbdc3293",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "即使构建函数的计算图需要通过Python控制流（例如，条件、循环或任意函数调用），我们仍然可以计算得到的变量的梯度"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "id": "cd3e9d01",
   "metadata": {
    "execution": {
     "iopub.execute_input": "2022-12-07T16:37:10.949650Z",
     "iopub.status.busy": "2022-12-07T16:37:10.949215Z",
     "iopub.status.idle": "2022-12-07T16:37:10.956034Z",
     "shell.execute_reply": "2022-12-07T16:37:10.954996Z"
    },
    "origin_pos": 57,
    "tab": [
     "pytorch"
    ]
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "tensor(True)"
      ]
     },
     "execution_count": 12,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "def f(a):\n",
    "    b = a * 2\n",
    "    while b.norm() < 1000:\n",
    "        b = b * 2\n",
    "    if b.sum() > 0:\n",
    "        c = b\n",
    "    else:\n",
    "        c = 100 * b\n",
    "    return c\n",
    "\n",
    "a = torch.randn(size=(), requires_grad=True)\n",
    "d = f(a)\n",
    "d.backward()\n",
    "\n",
    "a.grad == d / a"
   ]
  }
 ],
 "metadata": {
  "celltoolbar": "Slideshow",
  "language_info": {
   "name": "python"
  },
  "rise": {
   "autolaunch": true,
   "enable_chalkboard": true,
   "overlay": "<div class='my-top-right'><img height=80px src='http://d2l.ai/_static/logo-with-text.png'/></div><div class='my-top-left'></div>",
   "scroll": true
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}